IT377 Machine Learning Application Practical List
Subject Coordinator: Hemant Yadav
Subject Name: Machine Learning & Applications
Semester: 6
PRACTICAL LIST |
---|
Introduction to Python programming. How is Python used in machine learning? Discuss using Python with Google Colab. |
NumPy
- Creating arrays: blank, with predefined data, and with pattern-specific data
- Slicing and updating elements
- Shape manipulations
- Looping over arrays
- Reading files in NumPy
- Use NumPy vs. Python lists for matrix multiplication of 1000 x 1000 arrays and evaluate computing performance
For help: https://www.dataquest.io/m/289-introduction-to-numpy/ https://cloudxlab.com/blog/numpy-pandas-introduction/
Pandas
- Creating a data frame
- Reading files
- Slicing manipulations
- Exporting data to files
- Column and row manipulations with loops
- Use pandas for masking data and reading it in Boolean format
For help: https://www.hackerearth.com/practice/machine-learning/data-manipulation-visualisation-r-python/tutorial-data-manipulation-numpy-pandas-python/tutorial/
Matplotlib
- Importing matplotlib
- Simple line chart
- Correlation chart
- Histogram
- Plotting of multivariate data
- Pie chart
For help: https://towardsdatascience.com/data-visualization-using-matplotlib-16f1aae5ce70 |
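The NumPy vs. list comparison in this practical could be approached along the following lines. This is only a sketch: the helper names `matmul_lists` and `compare` are illustrative, the matrices are random, and the size is kept small so the pure-Python version finishes quickly (set `n` to 1000 for the actual exercise and expect the list version to take much longer).

```python
import time
import numpy as np

def matmul_lists(a, b):
    """Multiply two matrices stored as nested Python lists."""
    b_cols = list(zip(*b))  # transpose once so each column is a tuple
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]

def compare(n, seed=0):
    """Time list-based and NumPy multiplication of two random n x n matrices."""
    rng = np.random.default_rng(seed)
    a = rng.random((n, n))
    b = rng.random((n, n))
    a_list, b_list = a.tolist(), b.tolist()

    start = time.perf_counter()
    c_list = matmul_lists(a_list, b_list)
    t_list = time.perf_counter() - start

    start = time.perf_counter()
    c_np = a @ b
    t_np = time.perf_counter() - start

    return c_list, c_np, t_list, t_np

# Small size for a quick check; the practical asks for n = 1000.
c_list, c_np, t_list, t_np = compare(100)
print(f"lists: {t_list:.4f} s, NumPy: {t_np:.6f} s")
print("results agree:", np.allclose(c_list, c_np))
```

Timings depend on the machine, so report the measured values rather than expecting a fixed speed-up.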
Linear Regression
Select a dataset of your choice and respond to the following questions.
- Why do you want to apply regression to the selected dataset? Discuss the full story behind the dataset.
- How many total observations are in the data?
- How many independent variables are there?
- Which is the dependent variable?
- Which variables are most useful for estimation? Show this using correlation.
- Implement linear regression using the OLS method.
- Implement linear regression using gradient descent from scratch.
- Implement linear regression using the sklearn API.
- Quantify the goodness of your model and discuss the steps taken for improvement (RMSE, SSE, R2 score).
- Discuss a comparison of the different methods.
- Prepare a presentation on this work in a group of 5.
For help: refer to the following free course on DataCamp: Regression models: fitting them and evaluating their performance |
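As a starting point for the OLS and gradient-descent items, the sketch below fits both on synthetic data and reports RMSE and R2. The data-generating line (y = 3 + 2x + noise), the learning rate, the epoch count, and all function names are assumptions made for illustration; a real submission would use the chosen dataset and add the sklearn comparison.

```python
import numpy as np

def add_bias(X):
    """Prepend a column of ones so the intercept is learned as a weight."""
    return np.column_stack([np.ones(len(X)), X])

def fit_ols(X, y):
    """Least-squares fit; returns [intercept, coefficients...]."""
    theta, *_ = np.linalg.lstsq(add_bias(X), y, rcond=None)
    return theta

def fit_gd(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the mean squared error."""
    Xb = add_bias(X)
    theta = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        grad = 2.0 / len(y) * Xb.T @ (Xb @ theta - y)
        theta -= lr * grad
    return theta

def predict(X, theta):
    return add_bias(X) @ theta

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def r2_score(y, y_hat):
    sse = np.sum((y - y_hat) ** 2)          # sum of squared errors
    sst = np.sum((y - np.mean(y)) ** 2)     # total sum of squares
    return float(1.0 - sse / sst)

# Synthetic data, assumed for illustration only: y = 3 + 2x + noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 1))
y = 3.0 + 2.0 * X[:, 0] + rng.normal(0, 0.1, size=200)

theta_ols = fit_ols(X, y)
theta_gd = fit_gd(X, y)
for name, theta in [("OLS", theta_ols), ("GD", theta_gd)]:
    y_hat = predict(X, theta)
    print(name, theta, "RMSE:", rmse(y, y_hat), "R2:", r2_score(y, y_hat))
```

On unscaled real data, gradient descent usually needs feature scaling or a smaller learning rate to converge, which is worth discussing in the comparison of methods.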
Two Class Classification (Logistic Regression)
Select a dataset of your choice and respond to the following questions.
- Why do you want to apply classification to the selected dataset? Discuss the full story behind the dataset.
- How many total observations are in the data?
- How many independent variables are there?
- Which is the dependent variable?
- Which variables are most useful for classification? Show this using correlation.
- Implement the logistic function.
- Implement the log-loss function.
- Implement logistic regression from scratch.
- Implement logistic regression using the sklearn API.
- Quantify the goodness of your model and discuss the steps taken for improvement (accuracy, confusion matrix, F-measure).
- Discuss a comparison of the different methods.
- Prepare a presentation on this work in a group of 5.
For help:
1. https://medium.com/@anishsingh20/logistic-regression-in-python-423c8d32838b
2. https://www.datacamp.com/community/tutorials/understanding-logistic-regression-python
3. https://towardsdatascience.com/logistic-regression-python-7c451928efee
4. https://towardsdatascience.com/building-a-logistic-regression-in-python-step-by-step-becd4d56c9c8
5. https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html |
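The logistic function, log-loss, and from-scratch training items could be sketched as follows. The two-cluster synthetic data, the learning rate, the epoch count, and the function names are assumptions for illustration, not part of the practical; the sklearn version is left out here.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y, p, eps=1e-12):
    """Mean binary cross-entropy; p is clipped to avoid log(0)."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def add_bias(X):
    return np.column_stack([np.ones(len(X)), X])

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the log-loss."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        p = sigmoid(Xb @ w)
        w -= lr * Xb.T @ (p - y) / len(y)  # gradient of the mean log-loss
    return w

def predict_proba(X, w):
    return sigmoid(add_bias(X) @ w)

# Synthetic two-class data, assumed for illustration only
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.5, 1.0, size=(100, 2)),
               rng.normal(1.5, 1.0, size=(100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)])

w = fit_logistic(X, y)
p = predict_proba(X, w)
accuracy = float(np.mean((p >= 0.5) == y))
print("weights:", w, "log-loss:", log_loss(y, p), "accuracy:", accuracy)
```

The 0.5 threshold is a default choice; moving it trades precision against recall, which ties into the confusion-matrix discussion.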
Multi Class Classification (KNN)
Select a dataset of your choice and respond to the following questions.
- Why do you want to apply classification to the selected dataset? Discuss the full story behind the dataset.
- How many total observations are in the data?
- How many independent variables are there?
- Which is the dependent variable?
- Which variables are most useful for classification? Show this using correlation.
- Implement KNN using the sklearn API.
- Implement code to find the best value of k by splitting the data into train and test sets.
- Quantify the goodness of your model and discuss the steps taken for improvement.
- Can KNN also be used for regression? Why or why not?
- Discuss the drawbacks of algorithms such as KNN.
- Prepare a presentation on this work in a group of 5.
For help: https://www.analyticsvidhya.com/blog/2018/03/introduction-k-neighbours-algorithm-clustering/ |
Comparative analysis of models using quantitative measures
(F-measure, confusion matrix, RMSE, etc.). https://www.analyticsvidhya.com/blog/2019/08/11-important-model-evaluation-error-metrics/ |
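To make the classification measures concrete, here is a small pure-Python sketch that computes the confusion-matrix counts and the F-measure for a binary problem. The function names and the toy label lists are hypothetical and exist only for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def f_measure(tp, fp, fn, beta=1.0):
    """F-beta score; beta = 1 gives the usual F1 score."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Toy labels, assumed for illustration only
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
print("TP, FP, FN, TN:", tp, fp, fn, tn)
print("accuracy:", accuracy, "F1:", f_measure(tp, fp, fn))
```

For these toy labels the counts are 3 true positives, 1 false positive, 1 false negative, and 3 true negatives, so accuracy and F1 both come out to 0.75.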
Find a dataset in which the number of samples is smaller than the number of features. Apply principal component analysis (PCA) to obtain the K best features (principal components).
Use Support Vector Machines or Naïve Bayes to train a predictive model. Compare model accuracy and training time on the full dataset and on the K selected features. (Use the scikit-learn library.)
https://scikit-learn.org/stable/auto_examples/applications/plot_face_recognition.html https://www.dataquest.io/blog/sci-kit-learn-tutorial/ |
Perceptron algorithm for logic gates.
https://www.mldawn.com/train-a-perceptron-to-learn-the-and-gate-from-scratch-in-python/ |
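A perceptron for logic gates can be sketched in a few lines of plain Python. The version below uses a step activation and the classic perceptron update rule; the learning rate of 1, the epoch limit, and the function names are assumptions chosen for illustration.

```python
def step(z):
    """Step activation: fire (1) when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0

def predict(x, w, b):
    return step(w[0] * x[0] + w[1] * x[1] + b)

def train_perceptron(samples, lr=1.0, epochs=100):
    """Perceptron learning rule on ((x1, x2), target) pairs."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            delta = target - predict(x, w, b)
            if delta != 0:
                w[0] += lr * delta * x[0]
                w[1] += lr * delta * x[1]
                b += lr * delta
                errors += 1
        if errors == 0:  # every sample classified correctly
            break
    return w, b

AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
OR_GATE = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

for name, gate in [("AND", AND_GATE), ("OR", OR_GATE)]:
    w, b = train_perceptron(gate)
    outputs = [predict(x, w, b) for x, _ in gate]
    print(name, "weights:", w, "bias:", b, "outputs:", outputs)
```

AND and OR are linearly separable, so the loop stops once an epoch passes with no errors. XOR is not, so a single perceptron cannot learn it, which is a useful point to raise when comparing with the later neural-network practicals.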
Implement a convolutional neural network for handwritten digit classification. Tune it and compare it with practical 8.
Apply a convolutional neural network to image classification data of your choice and write down all the steps of hyperparameter optimization. (Use the Keras library.)
https://www.pyimagesearch.com/2018/04/16/keras-and-convolutional-neural-networks-cnns/ https://www.datacamp.com/community/tutorials/convolutional-neural-networks-python |
Use the K-Means clustering algorithm to group customers in order to optimize product delivery.
https://towardsdatascience.com/machine-learning-algorithms-part-9-k-means-example-in-python-f2ad05ed5203 https://www.datacamp.com/community/tutorials/k-means-clustering-python |
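To show what K-Means does internally, here is a from-scratch NumPy sketch of Lloyd's algorithm applied to made-up customer locations. The three synthetic delivery zones, the farthest-point initialization (used instead of random initialization to keep the run deterministic), and the function names are all assumptions for illustration; in the practical itself a library implementation on real customer data would be the natural choice.

```python
import numpy as np

def pairwise_dist(points, centers):
    """Euclidean distance from every point to every center."""
    return np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)

def init_farthest(points, k):
    """Start from the first point, then repeatedly add the farthest point."""
    centers = [points[0]]
    for _ in range(1, k):
        d = pairwise_dist(points, np.array(centers)).min(axis=1)
        centers.append(points[np.argmax(d)])
    return np.array(centers, dtype=float)

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and center update."""
    centers = init_farthest(points, k)
    for _ in range(max_iter):
        labels = np.argmin(pairwise_dist(points, centers), axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# Synthetic customer coordinates around three assumed delivery zones
rng = np.random.default_rng(0)
true_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
points = np.vstack([rng.normal(c, 1.0, size=(50, 2)) for c in true_centers])

centers, labels = kmeans(points, k=3)
print("cluster centers:\n", centers)
print("cluster sizes:", np.bincount(labels))
```

The learned centers can be read as candidate depot or hub locations, one per customer group. Choosing k (for example with an elbow plot of within-cluster distances) is a separate step not shown here.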
Make a presentation on any one machine learning application currently available in the market. Discuss its technical details, pros and cons, the situation before and after its introduction, and ongoing development of the application. |